KNN Regression as Geo-Imputation Method for Spatio-Temporal Wind Data
Abstract
The shift from traditional energy systems to distributed systems of energy suppliers and consumers, together with the volatility of power from renewable sources, implies the need for effective short-term prediction models. These machine learning models are based on measured sensor information. In practice, sensors might fail for several reasons, and the prediction models naturally cannot work properly with incomplete patterns. If the imputation method that completes the missing data is not chosen appropriately, a bias may be introduced. The objective of this work is to propose k-nearest neighbor (kNN) regression as a geo-imputation preprocessing step for pattern-label-based short-term wind prediction on spatio-temporal wind data sets. The approach is compared to three other methods. The evaluation is based on four turbines and their neighbors from the NREL Western Wind Data Set, with missing values distributed uniformly. The results show that kNN regression is the best of the compared imputation methods.
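The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: readings of neighboring turbines at a time step form the pattern, the target turbine's reading is the label, and a missing label is filled with the mean label of the k closest complete patterns. The function name `knn_impute`, the Euclidean distance, the unweighted mean, the assumption that neighbor readings are complete, and the synthetic data are all illustrative assumptions, not the paper's setup.

```python
import math
import random


def knn_impute(neighbors, target, k=3):
    """Fill None entries of `target` by kNN regression on neighbor readings.

    neighbors: one tuple per time step with the readings of the neighboring
               turbines (assumed complete in this sketch).
    target:    readings of the target turbine; None marks a missing value.
    """
    # Training set: time steps where the target turbine reported a value.
    train = [(x, y) for x, y in zip(neighbors, target) if y is not None]
    if len(train) < k:
        raise ValueError("need at least k complete time steps")

    imputed = list(target)
    for t, y in enumerate(target):
        if y is not None:
            continue
        # Rank complete patterns by Euclidean distance to the current pattern.
        nearest = sorted(train, key=lambda xy: math.dist(xy[0], neighbors[t]))[:k]
        # Unweighted kNN regression: mean of the k nearest labels.
        imputed[t] = sum(label for _, label in nearest) / k
    return imputed


# Synthetic example (not NREL data): the target is the mean of two neighbors.
random.seed(0)
neighbors = [(random.uniform(0, 30), random.uniform(0, 30)) for _ in range(200)]
truth = [(a + b) / 2 for a, b in neighbors]
# Drop values uniformly at random to mimic uniformly distributed sensor gaps.
observed = [None if random.random() < 0.3 else v for v in truth]
filled = knn_impute(neighbors, observed, k=3)
```

Comparing `filled` with `truth` at the positions where `observed` is `None` gives a simple imputation error for this toy setting; a real evaluation would use measured turbine data and a tuned value of k.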
Similar resources
KNN Classification and Regression using SAS
K-Nearest Neighbor (KNN) classification and regression are two widely used analytic methods in predictive modeling and data mining fields. They provide a way to model highly nonlinear decision boundaries, and to fulfill many other analytical tasks such as missing value imputation, local smoothing, etc. In this paper, we discuss ways in SAS® to conduct KNN classification and KNN regression. S...
Classification of Efficient Imputation Method for Analyzing Missing Values
In statistical analysis, missing data is a common data quality problem. Many real datasets have missing data. Imputation preserves all cases by replacing missing data with a probable value based on other available information. Once all missing values have been imputed, the data set can be analyzed using standard techniques for complete data. The aim of this paper is to describe the efficient imput...
Evaluation of Missing Value Estimation for Microarray Data
Microarray gene expression data contains missing values (MVs). However, some methods for downstream analyses, including some prediction tools, require a complete expression data matrix. Current methods for estimating the MVs include sample mean and K-nearest neighbors (KNN). Whether the accuracy of estimation (imputation) methods depends on the actual gene expression has not been thoroughly inv...
Spatio-temporal analysis of diurnal air temperature parameterization in Weather Stations over Iran
Diurnal air temperature modeling is a beneficial experimental and mathematical approach which can be used in many fields related to Geosciences. The modeling and spatio-temporal analysis of air Diurnal Temperature Cycle (DTC) was conducted using data obtained from 105 synoptic stations in Iran during the years 2013-2014 for the first time; the key variable for controlling the cosine term i...
Combining kNN Imputation and Bootstrap Calibrated: Empirical Likelihood for Incomplete Data Analysis
The k-nearest neighbor (kNN) imputation, as one of the most important research topics in incomplete data discovery, has been developed with great success on industrial data. However, it is difficult to obtain a mathematically valid and simple procedure to construct confidence intervals for evaluating the imputed data. This chapter studies a new estimation for missing (or incomplete) data that i...